This repository was archived by the owner on Feb 25, 2025. It is now read-only.

future: support engine create shell with aysnc mode #18047

Closed

Conversation

lucky-chen

@lucky-chen lucky-chen commented Apr 30, 2020

This is a split PR that adds support for creating the shell in async mode; see #17192 for the full context.

Currently, creating the shell blocks the platform thread. Blocking the main thread is very expensive on Android and iOS.

The core change is in CreateShellOnPlatformThread. The rest of the code covers:

  • a new async API for creating the shell
  • compatibility with the existing APIs
  • unit tests
  • a benchmark

@chinmaygarde

Comment on lines 238 to 239
static void Create(
ShellCreateCallback callback,
Member

Same feedback I had on the other issue. The engine assumes the shell is initialized synchronously everywhere. I don't support any change that relies on callbacks instead of std::futures to guarantee safety.

Author

@lucky-chen lucky-chen May 6, 2020

Hi, I tried using promise/future before, but there are serious problems in this case:

  • First, it may disturb the shell creation sequence.
    • Creating the shell involves many thread switches. If future.get() is called (blocking the thread) on the platform/UI/GPU/IO thread, it may disturb the timing.
  • Second, it puts a burden on the user.
    • Android/iOS code must call future.get() on a new thread (it can't be the IO/UI/GPU/platform thread).
  • Finally, it may cause a deadlock if the user calls future.get() on the platform thread, because we need to post a task to the platform thread to execute shell->Setup after the engine is created.
// Demo code on the platform thread
auto future = Shell::Create(...);
future.get();

// shell.cc (after review and modification)
// Now on the UI thread
auto engine_ref = std::make_unique<Engine>(...);
fml::TaskRunner::RunNowOrPostTask(
    shell->GetTaskRunners().GetPlatformTaskRunner(),
    fml::MakeCopyable([]() mutable {
      // Deadlock: Setup must be called on the platform thread,
      // which is still blocked in future.get() above.
      shell->Setup(...);
    }));

  • As for safety, the static Create method guarantees that the shell is fully initialized: the callback is called only after every step of the init sequence has executed. So, as I understand it, the shell should be safe.

Looking forward to your reply
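
The deadlock scenario described in this comment can be reproduced without any engine code. The following is a minimal sketch, not Flutter's implementation: FakePlatformRunner, Outcome, and DemonstrateSelfWait are hypothetical names, and the runner is a plain queue whose tasks only run when the owning thread drains it. It uses wait_for with a timeout in place of a blocking get() so that the demonstration terminates.

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <memory>
#include <queue>
#include <utility>

// Hypothetical stand-in for a platform task runner: posted tasks only run
// when the thread that owns the runner calls Drain().
class FakePlatformRunner {
 public:
  void Post(std::function<void()> task) { tasks_.push(std::move(task)); }

  void Drain() {
    while (!tasks_.empty()) {
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop();
      task();
    }
  }

 private:
  std::queue<std::function<void()>> tasks_;
};

struct Outcome {
  bool ready_before_drain;
  bool ready_after_drain;
};

Outcome DemonstrateSelfWait() {
  FakePlatformRunner runner;
  auto promise = std::make_shared<std::promise<int>>();
  std::future<int> future = promise->get_future();

  // The value is produced by a task queued on the same thread that waits.
  runner.Post([promise] { promise->set_value(42); });

  // A blocking future.get() here would never return, because this thread
  // is the only one that can run the queued task. wait_for times out instead.
  bool before = future.wait_for(std::chrono::milliseconds(10)) ==
                std::future_status::ready;

  runner.Drain();  // Only now does the queued task get a chance to run.
  bool after = future.wait_for(std::chrono::milliseconds(0)) ==
               std::future_status::ready;
  return {before, after};
}
```

The future only becomes ready once the waiting thread goes back to running its own queue, which is the condition a blocking get() on the platform thread would never allow.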

Member

I appreciate your consideration and thought put into this =)

it may disturb the sequence

What you are referring to is the synchronization that the future guarantees. The alternative is a race condition that may crash sometimes. The worst kind of bug to track down.

will bring burden to the user

This is not really a burden.

May be cause deadlock if user called future.get() on platformthread

This is a risk but it doesn't go away if you use callbacks instead of futures. If you are in a position where you would get a deadlock with futures, you will have an uninitialized variable and will have to abort the operation somehow, or you erroneously went down a code path prematurely. You can mimic the checking of an unfinished future by using wait_for.


I think your aversion to using futures stems from unfamiliarity and not a solid technical basis. All the problems you have with futures exist with callbacks, they are just harder to verify and thus easier to convince yourself they are correct, more fragile, and harder to debug.
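
For reference, the wait_for check mentioned above can be sketched as follows. IsFutureReady is a hypothetical helper name (the thread never defines one); it relies only on std::future::wait_for with a zero timeout, so it polls and never blocks.

```cpp
#include <chrono>
#include <future>

// Returns true only if the future holds a result that get() could return
// without blocking. An invalid (moved-from or already consumed) future and
// a deferred task both report false.
template <typename T>
bool IsFutureReady(const std::future<T>& future) {
  return future.valid() &&
         future.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}
```

A caller on the platform thread could use such a check to decide whether to consume the result now or to retry later, rather than blocking in get().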

Member

@liyuqian @lucky-chen I created some sample code that duplicates the deadlock predicament to play around with: https://gist.github.com/gaaclarke/84caf13aa0da0a7a2198247e2e6c8cd0

I think the easiest solution is to remove the dispatch to the platform thread. Just do those calculations up front on the platform thread or if the calculations depend on previous results create a partial result like this:

class ShellHolder {
  private:
    std::unique_ptr<Shell> _shell;
    std::future<PartialShell> _partial_shell;
  public:
    Shell* get() {
      if (_shell) {
         return _shell.get();
      }
      // First call: wait for the partial result, then finish on this thread.
      _shell = CompleteShell(_partial_shell.get());
      return _shell.get();
    }
};
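
A self-contained version of this idea might look like the sketch below. PartialShell, Shell, and CompleteShell are hypothetical stand-ins with made-up fields, since the comment does not define them; only the lazy "finish on first get()" structure is taken from the suggestion.

```cpp
#include <future>
#include <memory>
#include <string>
#include <utility>

// Hypothetical placeholder types; the real engine types are far richer.
struct PartialShell {
  std::string engine_state;
};

struct Shell {
  std::string description;
};

// Hypothetical completion step that must run on the caller's thread.
std::unique_ptr<Shell> CompleteShell(PartialShell partial) {
  auto shell = std::make_unique<Shell>();
  shell->description = partial.engine_state + "+setup";
  return shell;
}

class ShellHolder {
 public:
  explicit ShellHolder(std::future<PartialShell> partial)
      : partial_shell_(std::move(partial)) {}

  // The first call blocks until the partial result is available and then
  // completes the shell on the calling thread. Later calls return the
  // cached pointer without touching the future again.
  Shell* get() {
    if (!shell_) {
      shell_ = CompleteShell(partial_shell_.get());
    }
    return shell_.get();
  }

 private:
  std::unique_ptr<Shell> shell_;
  std::future<PartialShell> partial_shell_;
};
```

Because the completion step runs inside get(), no task has to be dispatched back to the calling thread, which is what removes the deadlock in this design.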

Author

@lucky-chen lucky-chen May 8, 2020

@gaaclarke Thanks for the reply. The other suggestions are very good, but I still hold a different view on callback vs. promise.

Suppose we have two methods:

  • promise/future: std::future<std::unique_ptr<Shell>> Create(...)
  • callback: void Create(callback)

When initializing on Android/iOS:

Using a future

// On the platform thread
void init() {
  auto future = create(...);

  // Usage 1: block
  future.get();  // Bad: blocking the main thread is very expensive on Android/iOS,
                 // and it may disturb the shell creation sequence (e.g. deadlock).

  // Usage 2: check status
  if (isFutureReady(future)) {
    // ...
  } else {
    // Call launchFlutter() at some later point (such as when starting FlutterActivity)
  }
}

void launchFlutter() {
  future.get();  // Fine if ready; otherwise it blocks (bad)
  // ...
}
template <typename T>
bool isFutureReady(std::future<T>& f){...}

Using a callback

With a callback, it is very simple and users have almost no burden:

void init() {
  Shell::Create([](std::unique_ptr<Shell> shell) {
    // On the platform thread
    // No burden, no deadlock
    // Won't disturb the shell creation sequence
    shell_ = std::move(shell);
    // ...
  });
}

An API user only cares about the resulting output and shouldn't have to deal with additional state. That state should be handled by the API itself.

In this case, the API should either return a fully initialized shell or return nothing. Returning a future is not a good fit here because the future carries intermediate state tied to the API's internal logic.

Contributor

@gaaclarke @lucky-chen : I believe the problem with C++ std::future here is the lack of std::future::then. (We're all used to Future.then in Dart, and I'm surprised that C++ doesn't have a counterpart.) The example above would be easy with std::future if the current experimental "then" could be used:

auto future = create(...);
future.then(callback);

Without then, I believe the best we can do now with future is std::async:

auto future = create(...);
std::async(std::launch::async, [...](){
  future.get();
  callback(...);
});

This ensures that the waiting happens on another thread (other than the platform thread), so it will not block or deadlock.
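
A compilable sketch of this std::async workaround is shown below. Then is a hypothetical helper name, not a standard or engine API, and it makes no attempt to control which thread runs the callback: the callback simply runs on the thread that std::async starts.

```cpp
#include <future>
#include <utility>

// Waits for `input` on a separate thread and passes the result to `callback`.
// The returned future should be kept by the caller: the future returned by
// std::async for a std::launch::async task blocks in its destructor until
// the task has finished.
template <typename T, typename Callback>
std::future<void> Then(std::future<T> input, Callback callback) {
  return std::async(std::launch::async,
                    [input = std::move(input),
                     callback = std::move(callback)]() mutable {
                      callback(input.get());
                    });
}
```

Storing the returned future somewhere that outlives the current scope keeps the calling thread free while the wait happens in the background.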

Note that the current callback implementation in this PR may also have a problem: it's unclear which thread will execute the callback. Currently the UI thread executes it, which avoids blocking and deadlock. But that violates the requirement to use a WeakPtr on the same thread. If we correctly call Shell::Setup, and subsequently the callback, on the platform thread, then we end up with blocking and deadlock again.

Author

@liyuqian @gaaclarke I made some changes based on the review suggestions; this may be better.

The new API is future<ShellHolder> createShellHolder(...)

Usage 1 (sync mode):

auto shell_holder_future = createShellHolder();
auto shell_holder = shell_holder_future.get();
auto shell = shell_holder.makeShell();

Usage 2 (async mode):

auto shell_holder_future = createShellHolder();
std::async(std::launch::async, [&]() {
  // Wait for engine creation to finish
  auto shell_holder = shell_holder_future.get();
  // makeShell guarantees that Setup is called on the platform thread
  auto shell = shell_holder.makeShell();
  // ...
});

For ShellHolder::makeShell:

  fml::AutoResetWaitableEvent latch;
  std::unique_ptr<Shell> shell_res;
  fml::TaskRunner::RunNowOrPostTask(PlatformTaskRunner, [&]() {
    shell_->Setup();
    shell_res = std::move(this->shell_);
    latch.Signal();
  });
  latch.Wait();
  return shell_res;
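
The "run now or post, then wait" shape of makeShell can be tried out with standard-library pieces only. This is a rough sketch under stated assumptions: SimpleRunner and RunNowOrPostAndWait are hypothetical names and are not the fml API, and the inline-run branch is what keeps a call made from the runner's own thread from waiting on itself.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical single-threaded task runner backed by one worker thread.
class SimpleRunner {
 public:
  SimpleRunner() : worker_([this] { Loop(); }) {}

  ~SimpleRunner() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    wake_.notify_one();
    worker_.join();
  }

  bool RunsTasksOnCurrentThread() const {
    return std::this_thread::get_id() == worker_.get_id();
  }

  void Post(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push(std::move(task));
    }
    wake_.notify_one();
  }

 private:
  void Loop() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // Stopping and nothing left to run.
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();
    }
  }

  std::mutex mutex_;
  std::condition_variable wake_;
  std::queue<std::function<void()>> tasks_;
  bool stopping_ = false;
  std::thread worker_;  // Declared last so the other members exist first.
};

// Runs `task` inline when already on the runner's thread; otherwise posts it
// and blocks the caller until it has finished.
void RunNowOrPostAndWait(SimpleRunner& runner,
                         const std::function<void()>& task) {
  if (runner.RunsTasksOnCurrentThread()) {
    task();
    return;
  }
  auto done = std::make_shared<std::promise<void>>();
  std::future<void> finished = done->get_future();
  runner.Post([&task, done] {
    task();
    done->set_value();
  });
  finished.wait();
}
```

Note that the waiting branch still blocks the calling thread, so the concern raised elsewhere in this thread about blocking the platform thread applies to any caller that is not already on the runner.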

Contributor

@liyuqian liyuqian left a comment

+1 to Aaron's suggestion to directly let CreateShellOnPlatformThread return a future instead of taking a callback.

Other than that, the biggest issue seems to be the calling of WeakPtrFactory::GetWeakPtr. Although the code document says that it's only safe to be called from the thread that the WeakPtr is created, it seems that none of our unit tests are failing if that's not the case. @chinmaygarde : is there any reason that we haven't already added CheckThreadSafety(); to WeakPtrFactory::GetWeakPtr?
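
To illustrate the kind of check being asked about, here is a minimal thread-affinity sketch. ThreadChecker is a hypothetical class, not the engine's CheckThreadSafety implementation (which this thread does not show); it only records the constructing thread and compares against it later.

```cpp
#include <thread>

// Remembers the thread it was constructed on and reports whether a later
// call happens on that same thread.
class ThreadChecker {
 public:
  ThreadChecker() : owner_(std::this_thread::get_id()) {}

  bool IsOnOwnerThread() const {
    return std::this_thread::get_id() == owner_;
  }

 private:
  const std::thread::id owner_;
};
```

An assertion built on such a check in a debug build would turn a silent cross-thread access into an immediate, reproducible failure.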

@gaaclarke gaaclarke changed the title future: supprot engine create shell with aysnc mode future: support engine create shell with aysnc mode May 6, 2020
@liyuqian liyuqian added perf: speed Performance issues related to (mostly rendering) speed severe: performance Relates to speed or footprint issues. labels May 7, 2020
@liyuqian
Contributor

liyuqian commented May 7, 2020

I wonder if this PR and its follow-up Android PR could fix flutter/flutter#40563? For now, we're blocking Android's main thread during engine initialization, and that could explain the "Skipped XX frames" message, which is often on the order of hundreds of milliseconds.

@gaaclarke
Member

@lucky-chen Looks like you are going to have to merge master in order to get the presubmits green.

@liyuqian
Contributor

+1 to @gaaclarke 's suggestion. Here's the command that I use to rebase against tip-of-tree master branch engine: git checkout master && git pull --rebase upstream master && git checkout <your_branch> && git rebase master.

@lucky-chen lucky-chen force-pushed the new_flt_deef2663aca_async_engine branch 2 times, most recently from 1034ed9 to 881036e Compare May 18, 2020 09:24
@lucky-chen
Author

@liyuqian I updated to master, but the Windows Host Engine and Windows Web Engine checks still fail.

Contributor

@liyuqian liyuqian left a comment

@lucky-chen I couldn't figure out why those tests fail either... I'd usually rebase again if I encountered such mysterious failures. I also left some partial comments as I haven't had time to review the full PR yet. Hopefully they're still helpful :)

@lucky-chen lucky-chen force-pushed the new_flt_deef2663aca_async_engine branch from 881036e to 8506bb5 Compare May 20, 2020 07:58
@googlebot

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@lucky-chen lucky-chen force-pushed the new_flt_deef2663aca_async_engine branch from 8506bb5 to e46e18a Compare May 20, 2020 07:59
@googlebot

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.

@googlebot googlebot removed the cla: no label May 20, 2020
Member

@gaaclarke gaaclarke left a comment

I think we are getting close. I still have concerns about the deadlock mentioned in the comments, and I'd like to get rid of is_consumed_, as also noted in the feedback.

This PR is very tricky to get right and not the highest priority so thanks for your patience as we get this right.

@lucky-chen lucky-chen force-pushed the new_flt_deef2663aca_async_engine branch 4 times, most recently from 2adc992 to 74c0c2c Compare June 11, 2020 06:34
@gaaclarke
Member

Dude, @lucky-chen, make sure you hit the "re-request review" after you've done work like this. I don't get notifications otherwise. I want to see this stuff!

Member

@gaaclarke gaaclarke left a comment

I think this is good now. This is pretty mad but it looks good. @liyuqian can you verify it for me please, it's complicated enough that it warrants 2 people to take a close look.

@gaaclarke gaaclarke force-pushed the new_flt_deef2663aca_async_engine branch from 74c0c2c to 3800e23 Compare July 9, 2020 00:08
@gaaclarke
Member

I rebased this onto master to shake out CI errors with windows.

@gaaclarke
Member

@lucky-chen The failure with windows builds is legitimate:

ninja -t msvc -e environment.x64 -- C:\b\s\w\ir\cache\goma\client/gomacc.exe ../../third_party/llvm-build/Release+Asserts/bin/clang-cl.exe /nologo /showIncludes /FC @obj/flutter/shell/common/shell_unittests_common.shell_unittests.obj.rsp /c ../../flutter/shell/common/shell_unittests.cc /Foobj/flutter/shell/common/shell_unittests_common.shell_unittests.obj /Fdobj/flutter/shell/common/shell_unittests_common_cc.pdb
../../flutter/shell/common/shell_unittests.cc(190,3): error: ignoring return value of function declared with 'nodiscard' attribute [-Werror,-Wunused-result]
  std::async(std::launch::async,
  ^~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~
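
The diagnostic fires because the result of std::async is discarded. Apart from the warning, the future returned for a std::launch::async task blocks in its destructor until the task finishes, so dropping it immediately makes the call effectively synchronous. The sketch below measures that effect; TimeScopedAsync is a hypothetical helper name used only for this illustration.

```cpp
#include <chrono>
#include <future>
#include <thread>

// Starts an async task that sleeps for `work`, lets the returned future go
// out of scope right away, and reports how long the caller was held up.
std::chrono::milliseconds TimeScopedAsync(std::chrono::milliseconds work) {
  const auto start = std::chrono::steady_clock::now();
  {
    // Binding the result to a variable satisfies [[nodiscard]], but the
    // destructor at the end of this scope still waits for the task.
    auto pending = std::async(std::launch::async,
                              [work] { std::this_thread::sleep_for(work); });
  }
  return std::chrono::duration_cast<std::chrono::milliseconds>(
      std::chrono::steady_clock::now() - start);
}
```

To keep the work truly in the background, the future has to be stored somewhere that outlives the scope that started it.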

@gaaclarke
Member

I reran the outstanding windows problem to see if it was a flake, it doesn't appear to be:

[ RUN      ] ShellTest.VerticesAccuratelyReportsSize
unknown file: error: SEH exception with code 0xc0000005 thrown in the test body.
[  FAILED  ] ShellTest.VerticesAccuratelyReportsSize (19 ms)

This appears to be a memory error which is going to be a pain to track down.

unref_queue_future.get(), //
snapshot_delegate_future.get() //
));
// wait params(io、gpu task end)
Contributor

、 should be replaced with ,

std::unique_ptr<ShellCreateParams> params) {
PerformInitializationTasks(params->settings);
PersistentCache::SetCacheSkSL(params->settings.cache_sksl);
TRACE_EVENT0("flutter", "Shell::Create");
Contributor

Put TRACE_EVENT0 as the first line of the function and change "Shell::Create" to "Shell::CreateShellHolder"

@@ -283,42 +300,76 @@ std::unique_ptr<Shell> Shell::Create(
const Shell::CreateCallback<PlatformView>& on_create_platform_view,
const Shell::CreateCallback<Rasterizer>& on_create_rasterizer,
DartVMRef vm) {
auto shell_holder_future = Shell::InitShellEnv(
Contributor

Put TRACE_EVENT0("flutter", "Shell::Create"); as the first line of this function.

return shell;
}

std::future<std::unique_ptr<Shell::ShellHolder>> Shell::InitShellEnv(
Contributor

Nit: InitializeShellHolder or CreateShellHolder might be a better name than InitShellEnv as we usually don't encourage abbreviations like Env.

}

fml::AutoResetWaitableEvent latch;
std::unique_ptr<Shell> shell;
// fml::AutoResetWaitableEvent latch;
Contributor

Remove // fml::AutoResetWaitableEvent latch;

fml::AutoResetWaitableEvent latch;
std::unique_ptr<Shell> shell;
// fml::AutoResetWaitableEvent latch;
std::promise<std::unique_ptr<Shell::ShellHolder>> shell_holder_promise;
Contributor

Can std::promise<std::unique_ptr<Shell::ShellHolder>> shell_holder_promise; be put before line 340 so we can save line 343?

@@ -432,6 +493,15 @@ class Shell final : public PlatformView::Delegate,
const Shell::CreateCallback<PlatformView>& on_create_platform_view,
const Shell::CreateCallback<Rasterizer>& on_create_rasterizer);

static std::future<std::unique_ptr<Shell::ShellHolder>> InitShellEnv(
Contributor

Document this function.

@@ -423,7 +483,8 @@ class Shell final : public PlatformView::Delegate,

Shell(DartVMRef vm, TaskRunners task_runners, Settings settings);

static std::unique_ptr<Shell> CreateShellOnPlatformThread(
static void CreateShellOnPlatformThread(
Contributor

Document this function especially on how the new shell_holder_promise parameter should be given and used later.

@scutlight
Member

I think this PR is quite obscure. The ShellHolder is intricate, and I don't think it is a good idea to introduce a new, complicated API to users of Shell.
To support asynchronous engine launches, there is an alternative that hides the asynchronous behavior inside Shell, keeping things simple for its users. I just opened PR #19641 to implement this idea and would like you to have a look at it. @liyuqian @gaaclarke

@zanderso zanderso requested a review from gaaclarke July 23, 2020 21:53
@zanderso
Member

@lucky-chen This PR is getting a bit stale. @gaaclarke's and @liyuqian's feedback hasn't been taken into account. A more focused PR will probably be easier to review and land.

Labels
cla: yes perf: speed Performance issues related to (mostly rendering) speed severe: performance Relates to speed or footprint issues.
6 participants